Haodong Li

SIGMA: Semantic-Difference Instruction-Grounding Mask Annotator for Text-Driven Image Manipulation Localization

May 27, 2026
Via arXiv

Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning

May 03, 2026
Via arXiv

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Apr 15, 2026
Via arXiv

AgentFoX: LLM Agent-Guided Fusion with eXplainability for AI-Generated Image Detection

Mar 24, 2026
Via arXiv

PEARL: Personalized Streaming Video Understanding Model

Mar 20, 2026
Via arXiv

DVD: Deterministic Video Depth Estimation with Generative Priors

Mar 12, 2026
Via arXiv

WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics

Mar 11, 2026
Via arXiv

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Mar 09, 2026
Via arXiv

UniM: A Unified Any-to-Any Interleaved Multimodal Benchmark

Mar 05, 2026
Via arXiv

Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

Mar 03, 2026
Via arXiv